Language-Based Image Editing with Recurrent Attentive Models
Authors
Abstract
We investigate the problem of Language-Based Image Editing (LBIE). Given a source image and a natural language description, we want to generate a target image by editing the source image based on the description. We propose a generic modeling framework for two sub-tasks of LBIE: language-based image segmentation and image colorization. The framework uses recurrent attentive models to fuse image and language features. Instead of using a fixed step size, we introduce for each region of the image a termination gate that dynamically determines, at each inference step, whether to continue extrapolating additional information from the textual description. The effectiveness of the framework is validated on three datasets. First, we introduce a synthetic dataset, called CoSaL, to evaluate the end-to-end performance of our LBIE system. Second, we show that the framework leads to state-of-the-art performance on image segmentation on the ReferIt dataset. Third, we present the first language-based colorization result on the Oxford-102 Flowers dataset, laying the foundation for future research.
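The per-region termination gate described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `attentive_fusion`, the dot-product attention, the `tanh` update, the gate parameters `w_gate` and `b_gate`, and the fixed `threshold` are all assumptions chosen for illustration, and the abstract does not say how the gate is parameterized or trained. The sketch only shows the general idea that each image region attends over the word features repeatedly and stops updating once its own gate fires.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def attentive_fusion(region_feats, word_feats, w_gate, b_gate,
                     max_steps=5, threshold=0.5):
    """Illustrative recurrent attention with a per-region termination gate.

    region_feats: (R, D) array, one feature vector per image region.
    word_feats:   (T, D) array, one feature vector per word.
    w_gate, b_gate: hypothetical gate parameters, shapes (D,) and scalar.
    Returns the fused region states (R, D) and the number of steps
    each region ran (R,).
    """
    state = region_feats.astype(float).copy()
    active = np.ones(len(state), dtype=bool)   # regions still reading the text
    steps = np.zeros(len(state), dtype=int)

    for _ in range(max_steps):
        if not active.any():
            break
        # Soft attention of every region over the words (assumed dot-product scores).
        scores = state @ word_feats.T                      # (R, T)
        scores = scores - scores.max(axis=1, keepdims=True)
        weights = np.exp(scores)
        weights = weights / weights.sum(axis=1, keepdims=True)
        context = weights @ word_feats                     # (R, D)

        # Only regions whose gate has not fired yet are updated.
        updated = np.tanh(state + context)
        state = np.where(active[:, None], updated, state)
        steps += active

        # Termination gate: a region stops once its stop probability
        # reaches the threshold (deterministic rule assumed here).
        stop_prob = sigmoid(state @ w_gate + b_gate)       # (R,)
        active &= stop_prob < threshold

    return state, steps


rng = np.random.default_rng(0)
regions = rng.normal(size=(4, 8))
words = rng.normal(size=(6, 8))
fused, steps_used = attentive_fusion(regions, words, rng.normal(size=8), 0.0)
print(fused.shape, steps_used)
```

With a learned gate, regions that are easy to resolve from the description would be expected to stop early while ambiguous regions keep attending, which is the behavior a fixed step count cannot express.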
Similar resources
Attentive Language Models
In this paper, we extend Recurrent Neural Network Language Models (RNN-LMs) with an attention mechanism. We show that an Attentive RNN-LM (with 14.5M parameters) achieves a better perplexity than larger RNN-LMs (with 66M parameters) and achieves performance comparable to an ensemble of 10 similar sized RNN-LMs. We also show that an Attentive RNN-LM needs less contextual information to achieve s...
Inner Attention based Recurrent Neural Networks for Answer Selection
Attention based recurrent neural networks have shown advantages in representing natural language sentences (Hermann et al., 2015; Rocktäschel et al., 2015; Tan et al., 2015). Based on recurrent neural networks (RNN), external attention information was added to hidden representations to get an attentive sentence representation. Despite the improvement over nonattentive models, the attention mech...
Recurrent Highway Networks with Language CNN for Image Captioning
Language models based on recurrent neural networks have dominated recent image caption generation tasks. In this paper, we introduce a language CNN model which is suitable for statistical language modeling tasks and shows competitive performance in image captioning. In contrast to previous models which predict next word based on one previous word and hidden state, our language CNN is fed with a...
Irony Detection with Attentive Recurrent Neural Networks
Automatic Irony Detection refers to making computers understand the real human intentions behind ironic language. Much work has been done using classic machine learning techniques applied to various features. In contrast to sophisticated feature engineering, this paper investigates how deep learning can be applied to the intended task with the help of word embeddings. Three different d...
DRAW: A Recurrent Neural Network For Image Generation
This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation. DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images. The system substantially improves on the state of the art for ge...
Journal title:
- CoRR
Volume abs/1711.06288, Issue
Pages -
Publication date 2017